Consumers use review data to decide where to eat and what to buy, and businesses use it to gather feedback and benchmark themselves against competitors. Making good decisions with this data requires understanding the trends within it. I looked at Yelp data spanning 2005 to 2022 to see what trends I could find. Analyzing the sentiment of around 50,000 reviews, I found expected patterns, such as more positive-sounding reviews tending to carry higher star ratings, as well as less intuitive ones: reviews with lower star ratings tended to be more neutral in tone, with lower variability, while highly rated reviews tended to be more polarized, with higher variability. Finally, I selected a few categories of businesses and explored how their star ratings and tone compared to each other and varied over time.
# # Extract data from tarfile
# with tarfile.open("yelp_dataset.tgz", "r") as all_data:
#     # Extract all members of the archive
#     all_data.extractall(filter="tar")
# # Open and read the review JSON file
# review_rows = []
# with open("yelp_academic_dataset_review.json", "r", encoding="utf-8") as file:
#     for line in file:
#         # Load each line as a separate JSON object (row)
#         row = json.loads(line)
#         review_rows.append(row)
# # Convert the list of rows into a pandas DataFrame
# review = pd.DataFrame(review_rows)
# # Open and read the business JSON file
# business_rows = []
# with open(
#     "yelp_academic_dataset_business.json", "r", encoding="utf-8"
# ) as file:
#     for line in file:
#         # Load each line as a separate JSON object (row)
#         row = json.loads(line)
#         business_rows.append(row)
# # Convert the list of rows into a pandas DataFrame
# business = pd.DataFrame(business_rows)
# # Take a sample of the businesses
# business_sample = business.sample(n=1000, random_state=1)
# # Combine the two datasets
# combined = business_sample.merge(
#     review, how="inner", on="business_id", suffixes=("_business", "_review")
# )
# # Convert the combination into a csv
# combined.to_csv("yelp_combined.csv", index=False)
The Yelp data contained so many data points that running any code took longer than reasonable. Because of this, I took a random sample of 1,000 businesses from the business dataset and joined it with the review dataset, giving me all the reviews for each business in my random sample.
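The sample-then-join logic can be illustrated on toy frames (a sketch with invented data; only `business_id` and `stars` stand in for the real Yelp columns). The inner join keeps only reviews whose business appears in the business table, and the suffixes disambiguate the shared `stars` column:

```python
import pandas as pd

# Toy stand-ins for the Yelp business and review tables (invented data)
business = pd.DataFrame(
    {"business_id": ["b1", "b2", "b3"], "stars": [4.5, 3.0, 5.0]}
)
review = pd.DataFrame(
    {"business_id": ["b1", "b1", "b3", "b9"], "stars": [5, 4, 5, 1]}
)

# Inner join: b2 (no reviews) and b9 (no matching business) drop out;
# the shared "stars" column becomes stars_business / stars_review
combined = business.merge(
    review, how="inner", on="business_id", suffixes=("_business", "_review")
)
print(combined)
```

In the real pipeline the left frame is first reduced with `DataFrame.sample(n=1000, random_state=1)`, which works the same way on the toy frame.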
# Import libraries
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from IPython.display import display
from wordcloud import WordCloud
from textblob import TextBlob
import ipywidgets as widgets
import scipy.stats as stats
import scikit_posthocs as sp
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.multicomp import pairwise_tukeyhsd
import json
from itertools import combinations
import statsmodels.stats.multitest as smm
%matplotlib inline
pio.templates.default = "plotly_white"
I then did a bit of feature engineering, creating a new variable for the length of each review. Using the TextBlob library, I calculated each review's polarity, which measures how negative (-1) or positive (1) the text is, and its subjectivity, which measures how factual (0) or opinion-based (1) it is. I also created a variable, Abs_polarity, which measures the strength of the polarity by taking the absolute value of the polarity.
# Download data
combined = pd.read_csv("yelp_combined.csv", encoding="utf-8")
# # Calculating the length of each review
# combined["Review Length"] = combined["text"].apply(len)
# # Loop through each row and calculate sentiments
# for index, text in combined["text"].items():
#     blob = TextBlob(text)
#     combined.at[index, "Polarity"] = blob.sentiment.polarity
#     combined.at[index, "Subjectivity"] = blob.sentiment.subjectivity
# # Calculate polarity strength
# combined["Abs_polarity"] = abs(combined["Polarity"])
Below are the first five rows of the dataset as well as some preliminary statistics. I also checked for null values. The only variables that had null values were address, attributes, and hours, none of which I use in this analysis.
pd.set_option("display.max_colwidth", 10)
# Display the first five rows
display(combined.head())
# Show some preliminary statistics
display(combined.describe())
# Check for null values
print("Null Values")
display(combined.isnull().sum())
| business_id | name | address | city | state | postal_code | latitude | longitude | stars_business | review_count | ... | funny | cool | text | date | Review Length | Star Sentiment | Polarity | Subjectivity | Abs_polarity | Sentiment | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | L_f14M... | Luca I... | 100 2n... | Saint ... | FL | 33701 | 27.7733 | -82.633806 | 5.0 | 5 | ... | 1 | 1 | This n... | 2016-03... | 629 | Positive | 0.347708 | 0.678889 | 0.347708 | Positive |
| 1 | L_f14M... | Luca I... | 100 2n... | Saint ... | FL | 33701 | 27.7733 | -82.633806 | 5.0 | 5 | ... | 0 | 0 | Phenom... | 2021-03... | 85 | Positive | 0.609375 | 0.716667 | 0.609375 | Positive |
| 2 | L_f14M... | Luca I... | 100 2n... | Saint ... | FL | 33701 | 27.7733 | -82.633806 | 5.0 | 5 | ... | 0 | 0 | Great ... | 2020-02... | 233 | Positive | 0.390000 | 0.578750 | 0.390000 | Positive |
| 3 | L_f14M... | Luca I... | 100 2n... | Saint ... | FL | 33701 | 27.7733 | -82.633806 | 5.0 | 5 | ... | 0 | 0 | My hus... | 2021-02... | 312 | Positive | 0.215260 | 0.558157 | 0.215260 | Positive |
| 4 | L_f14M... | Luca I... | 100 2n... | Saint ... | FL | 33701 | 27.7733 | -82.633806 | 5.0 | 5 | ... | 0 | 0 | Gorgeo... | 2021-03... | 155 | Positive | 0.700000 | 0.950000 | 0.700000 | Positive |
5 rows × 28 columns
| latitude | longitude | stars_business | review_count | is_open | stars_review | useful | funny | cool | date | Review Length | Polarity | Subjectivity | Abs_polarity | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 50363.... | 50363.... | 50363.... | 50363.... | 50363.... | 50363.... | 50363.... | 50363.... | 50363.... | 50363 | 50363.... | 50363.... | 50363.... | 50363.... |
| mean | 35.559602 | -90.478860 | 3.786351 | 277.43657 | 0.817823 | 3.785477 | 1.172508 | 0.325139 | 0.506106 | 2016-1... | 570.64... | 0.248511 | 0.564448 | 0.288057 |
| min | 27.688229 | -119.88... | 1.000000 | 5.00000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 2005-0... | 17.000000 | -1.000000 | 0.000000 | 0.000000 |
| 25% | 29.938914 | -90.337143 | 3.500000 | 48.00000 | 1.000000 | 3.000000 | 0.000000 | 0.000000 | 0.000000 | 2014-1... | 230.00... | 0.109670 | 0.484946 | 0.143234 |
| 50% | 36.325741 | -86.188308 | 4.000000 | 152.00000 | 1.000000 | 4.000000 | 0.000000 | 0.000000 | 0.000000 | 2017-0... | 409.00... | 0.254762 | 0.562143 | 0.266667 |
| 75% | 39.910601 | -82.287938 | 4.500000 | 381.00000 | 1.000000 | 5.000000 | 1.000000 | 0.000000 | 0.000000 | 2019-0... | 721.00... | 0.394643 | 0.644444 | 0.400000 |
| max | 53.631919 | -74.700195 | 5.000000 | 1291.0... | 1.000000 | 5.000000 | 132.00... | 77.000000 | 131.00... | 2022-0... | 5000.0... | 1.000000 | 1.000000 | 1.000000 |
| std | 5.491401 | 15.121928 | 0.751343 | 305.53432 | 0.385994 | 1.465665 | 2.812676 | 1.363873 | 2.050777 | NaN | 532.09... | 0.237985 | 0.134091 | 0.188194 |
Null Values
business_id          0
name                 0
address           1027
city                 0
state                0
postal_code          0
latitude             0
longitude            0
stars_business       0
review_count         0
is_open              0
attributes        1033
categories           0
hours             2667
review_id            0
user_id              0
stars_review         0
useful               0
funny                0
cool                 0
text                 0
date                 0
Review Length        0
Star Sentiment       0
Polarity             0
Subjectivity         0
Abs_polarity         0
Sentiment            0
dtype: int64
To understand the variables I was working with, I first used bar graphs and histograms to examine their distributions. As you can see, the most common star rating for a review was 5; 4 was next, closely followed by 1. Fewer reviews gave a rating of 2 or 3. As one might expect, the distribution of a business's average star rating looks more normal. It is centered at 4, with more extreme values less common. Given that 5 is the highest possible rating, the distribution is not symmetric, since the right tail is cut off.
# Bar Plot of the distribution of stars
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
sns.set_style("white")
sns.countplot(data=combined, x="stars_review", color="darkviolet", ax=axes[0])
axes[0].set_title("Distribution of Review Star Ratings")
axes[0].set_xlabel("Review Star Rating")
axes[0].set_ylabel("Count")
axes[0].grid(True, axis="y")
sns.countplot(
data=combined, x="stars_business", color="darkviolet", ax=axes[1]
)
axes[1].set_title("Distribution of Business Star Ratings")
axes[1].set_xlabel("Business Star Rating")
axes[1].set_ylabel("Count")
axes[1].grid(True, axis="y")
plt.tight_layout()
plt.show()
The distributions of the review polarities and subjectivities also appear approximately normal. Polarity is relatively symmetric and centered around 0.25; subjectivity is relatively symmetric and centered around 0.55. Both have centers slightly above the midpoints of their respective ranges: polarity trends slightly positive, and subjectivity trends slightly more opinionated than factual. Absolute polarity looks different. The most common polarity strength is around 0.25, but the data are clearly right-skewed, with lower absolute polarities more common than higher ones. This means that reviews tended to be balanced in tone rather than strictly positive or negative.
# Histogram of the distribution of Polarity and Subjectivity
fig, axes = plt.subplots(1, 3, figsize=(21, 7))
sns.set_style("white")
sns.histplot(
data=combined, x="Polarity", color="darkviolet", kde=True, ax=axes[0]
)
axes[0].set_title("Distribution of Review Polarity")
axes[0].set_xlabel("Review Polarity")
axes[0].set_ylabel("Count")
axes[0].grid(True, axis="y")
sns.histplot(
data=combined, x="Subjectivity", color="darkviolet", kde=True, ax=axes[1]
)
axes[1].set_title("Distribution of Review Subjectivity")
axes[1].set_xlabel("Review Subjectivity")
axes[1].set_ylabel("Count")
axes[1].grid(True, axis="y")
sns.histplot(
data=combined, x="Abs_polarity", color="darkviolet", kde=True, ax=axes[2]
)
axes[2].set_title("Distribution of Absolute Polarity")
axes[2].set_xlabel("Absolute Polarity")
axes[2].set_ylabel("Count")
axes[2].grid(True, axis="y")
plt.tight_layout()
plt.show()
Each business came with many different categories, some of which were not very distinct (e.g., Restaurants vs. Food). To keep the insights digestible, I narrowed them down to six categories that I thought were both common and distinct from each other. Below is a bar graph of the count of each category. Although the "Restaurants" category has by far the most data points at 35,093, the other categories have plenty of their own, with the least frequent, "Hotels & Travel", having 1,601.
# Convert the 'categories' column into a list
combined["categories"] = combined["categories"].str.split(",")
# Normalize whitespace and capitalization of each category name
combined["categories"] = combined["categories"].apply(
    lambda x: [category.strip().title() for category in x]
)
# Keep only the six chosen categories, then explode to one category per row
categories_to_filter = [
"Restaurants",
"Event Planning & Services",
"Shopping",
"Beauty & Spas",
"Arts & Entertainment",
"Hotels & Travel",
]
combined_filtered = combined[
combined["categories"].apply(
lambda x: any(cat in x for cat in categories_to_filter)
)
]
combined_filtered_exploded = combined_filtered.explode("categories")
combined_filtered_exploded = combined_filtered_exploded[
combined_filtered_exploded["categories"].isin(categories_to_filter)
].reset_index(drop=True)
category_order = combined_filtered_exploded["categories"].value_counts().index
# display(combined_filtered_exploded["categories"].value_counts())
# Create barplot of business categories
ax = sns.countplot(
data=combined_filtered_exploded,
y="categories",
order=category_order,
color="darkviolet",
zorder=10,
)
ax.bar_label(ax.containers[0], rotation=-90)
for spine in ax.spines.values():
spine.set_zorder(2)
for p in ax.patches:
p.set_zorder(1)
plt.title("Distribution of Main Business Categories")
plt.ylabel("Category")
plt.xlabel("Count")
plt.grid(axis="x", zorder=0, lw=0.5)
plt.tight_layout()
plt.show()
Despite the star rating being the intended polarity of a review, sometimes the content of a review doesn’t fully match that. The below graphs explore the relationships between star rating and the polarity, subjectivity, and polarity strength of the reviews.
The overall trend between polarity and star rating is generally as expected: a higher star rating is associated with a higher polarity. It is interesting to note, however, that even for a star rating of 1, the average polarity is only slightly below 0, indicating only a mildly negative tone.
The relationship between star rating and subjectivity is less pronounced, but also shows a positive relationship. A higher star rating correlates with more subjectivity. That isn’t the only pattern, though. The spread of subjectivities is greater for more extreme star ratings than it is for the middle star ratings. In other words, more extreme star ratings were associated with both higher and lower subjectivities while the middle star ratings tended to be more consistent.
Finally, the relationship between star rating and polarity strength (absolute polarity) is also positive. This is somewhat unexpected: it means that lower-starred reviews tended to be more neutral than higher-starred ones, when you might expect the middle star ratings to be the most neutral. It is, however, consistent with the first graph, which showed the polarities of low star ratings centered around 0 and moving upward (one direction of "more extreme") from there. Additionally, the distribution of polarity strength is more concentrated for star ratings of 1 and 2 and more spread out for the higher star ratings, indicating that polarity strength is more consistent for lower star ratings and more variable for higher ones.
# Boxplots of Review Star Ratings with Sentiments
fig, axes = plt.subplots(1, 3, figsize=(21, 7))
# Plot for Abs_polarity
sns.boxplot(
data=combined,
x="stars_review",
y="Abs_polarity",
color="white",
fliersize=1,
linecolor="black",
ax=axes[2],
zorder=0,
)
sns.violinplot(
data=combined,
x="stars_review",
y="Abs_polarity",
color="darkviolet",
ax=axes[2],
alpha=0.25,
inner=None,
zorder=10,
)
axes[2].set_title("Absolute Polarity vs. Star Rating")
axes[2].set_xlabel("Star Rating")
axes[2].set_ylabel("Absolute Polarity")
# Plot for Polarity
sns.boxplot(
data=combined,
x="stars_review",
y="Polarity",
color="white",
fliersize=1,
linecolor="black",
ax=axes[0],
zorder=0,
)
sns.violinplot(
data=combined,
x="stars_review",
y="Polarity",
color="darkviolet",
ax=axes[0],
alpha=0.25,
inner=None,
zorder=10,
)
axes[0].set_title("Polarity vs. Star Rating")
axes[0].set_xlabel("Star Rating")
# Plot for Subjectivity
sns.boxplot(
data=combined,
x="stars_review",
y="Subjectivity",
color="white",
fliersize=1,
linecolor="black",
ax=axes[1],
zorder=0,
)
sns.violinplot(
data=combined,
x="stars_review",
y="Subjectivity",
color="darkviolet",
ax=axes[1],
alpha=0.25,
inner=None,
zorder=10,
)
axes[1].set_title("Subjectivity vs. Star Rating")
axes[1].set_xlabel("Star Rating")
plt.tight_layout()
plt.show()
I was also interested in whether the various sentiment metrics differ by business category. The charts below show them looking fairly similar, but given the large amount of data used, even differences that look minute on the graphs may be statistically significant. Because of this, I also ran statistical tests to better understand whether there were differences.
sorted_categories_polarity = (
combined_filtered_exploded.groupby("categories")["Polarity"]
.median()
.sort_values(ascending=True)
.index.to_list()
)
sorted_categories_subjectivity = (
combined_filtered_exploded.groupby("categories")["Subjectivity"]
.median()
.sort_values(ascending=True)
.index.to_list()
)
sorted_categories_abspolarity = (
combined_filtered_exploded.groupby("categories")["Abs_polarity"]
.median()
.sort_values(ascending=True)
.index.to_list()
)
# Boxplots of Business Categories with Sentiments
fig, axes = plt.subplots(1, 3, figsize=(21, 7))
# Plot for Abs_polarity
sns.boxplot(
data=combined_filtered_exploded,
y="categories",
x="Abs_polarity",
order=sorted_categories_abspolarity,
color="white",
fliersize=1,
linecolor="black",
ax=axes[2],
zorder=0,
)
sns.violinplot(
data=combined_filtered_exploded,
y="categories",
x="Abs_polarity",
order=sorted_categories_abspolarity,
color="darkviolet",
ax=axes[2],
alpha=0.25,
inner=None,
zorder=10,
)
axes[2].set_title("Absolute Polarity vs. Business Category")
axes[2].set_ylabel("Business Category")
axes[2].set_xlabel("Absolute Polarity")
# Plot for Polarity
sns.boxplot(
data=combined_filtered_exploded,
y="categories",
x="Polarity",
order=sorted_categories_polarity,
color="white",
fliersize=1,
linecolor="black",
ax=axes[0],
zorder=0,
)
sns.violinplot(
data=combined_filtered_exploded,
y="categories",
x="Polarity",
order=sorted_categories_polarity,
color="darkviolet",
ax=axes[0],
alpha=0.25,
inner=None,
zorder=10,
)
axes[0].set_title("Polarity vs. Business Category")
axes[0].set_ylabel("Business Category")
# Plot for Subjectivity
sns.boxplot(
data=combined_filtered_exploded,
y="categories",
x="Subjectivity",
order=sorted_categories_subjectivity,
color="white",
fliersize=1,
linecolor="black",
ax=axes[1],
zorder=0,
)
sns.violinplot(
data=combined_filtered_exploded,
y="categories",
x="Subjectivity",
order=sorted_categories_subjectivity,
color="darkviolet",
ax=axes[1],
alpha=0.25,
inner=None,
zorder=10,
)
axes[1].set_title("Subjectivity vs. Business Category")
axes[1].set_ylabel("Business Category")
plt.tight_layout()
plt.show()
To check whether the means of the sentiment metrics differed between business categories, I wanted to run an ANOVA test. Before doing that, however, it was important to check the test's assumptions. Below are the results of those checks. While the residuals didn't show any significant heteroskedasticity, the QQ plots show non-normality in the residuals for all three sentiment metrics. Because of this, I chose the Kruskal-Wallis test, and its post-hoc counterpart, Dunn's test, to compare the medians, since it does not require normal residuals. To run this test, I assume that each review was written independently of other reviews within the same category. This may not be a perfect assumption, since people are likely influenced by other reviews they see, but it is necessary for the test. I also assume that the reviews from one category are independent of the reviews in other categories; this is reasonable, since the six categories I chose are unlikely to overlap much. Finally, to use this test to compare medians rather than mean ranks, I assume that the distribution shape of each sentiment metric is the same across business categories. This seems reasonable, given that the shapes of the box plots and violin plots appear similar within each of the three graphs.
warnings.filterwarnings("ignore")
# Test ANOVA Assumptions for Polarity
model1 = smf.ols(
"Polarity ~ C(categories)", data=combined_filtered_exploded
).fit()
model2 = smf.ols(
"Subjectivity ~ C(categories)", data=combined_filtered_exploded
).fit()
model3 = smf.ols(
"Abs_polarity ~ C(categories)", data=combined_filtered_exploded
).fit()
residuals1 = model1.resid
fitted1 = model1.fittedvalues
residuals2 = model2.resid
fitted2 = model2.fittedvalues
residuals3 = model3.resid
fitted3 = model3.fittedvalues
fig, axes = plt.subplots(3, 2, figsize=(12, 12))
sns.residplot(
x=fitted1,
y=residuals1,
lowess=True,
line_kws={"color": "red"},
ax=axes[0, 0],
)
axes[0, 0].set_xlabel("Fitted Values")
axes[0, 0].set_ylabel("Residuals")
axes[0, 0].set_title("Residuals vs. Fitted Values")
sm.qqplot(residuals1, line="s", ax=axes[0, 1])
axes[0, 1].set_title("QQ Plot of Residuals")
sns.residplot(
x=fitted2,
y=residuals2,
lowess=True,
line_kws={"color": "red"},
ax=axes[1, 0],
)
axes[1, 0].set_xlabel("Fitted Values")
axes[1, 0].set_ylabel("Residuals")
axes[1, 0].set_title("Residuals vs. Fitted Values")
sm.qqplot(residuals2, line="s", ax=axes[1, 1])
axes[1, 1].set_title("QQ Plot of Residuals")
sns.residplot(
x=fitted3,
y=residuals3,
lowess=True,
line_kws={"color": "red"},
ax=axes[2, 0],
)
axes[2, 0].set_xlabel("Fitted Values")
axes[2, 0].set_ylabel("Residuals")
axes[2, 0].set_title("Residuals vs. Fitted Values")
sm.qqplot(residuals3, line="s", ax=axes[2, 1])
axes[2, 1].set_title("QQ Plot of Residuals")
plt.suptitle("ANOVA Assumptions", fontsize=16)
fig.text(
0.5, 0.925, "Polarity vs. Business Category", ha="center", fontsize=14
)
fig.text(
0.5, 0.62, "Subjectivity vs. Business Category", ha="center", fontsize=14
)
fig.text(
0.5,
0.31,
"Absolute Polarity vs. Business Category",
ha="center",
fontsize=14,
)
plt.tight_layout(h_pad=3.5)
fig.subplots_adjust(top=0.9)
plt.show()
First, I ran the Kruskal-Wallis test for a polarity difference between the business categories. The Kruskal-Wallis statistic was 159.01, with a corresponding p-value of approximately 0. This indicates sufficient evidence to reject the null and conclude that the median polarity differs between at least two of the business categories. To see where the differences were, I ran the post-hoc Dunn's test with a Bonferroni correction for multiple testing. Below are the p-values for each pairwise combination. Arts & Entertainment was not statistically different from Hotels & Travel or Shopping. Beauty & Spas was not statistically different from Event Planning or Restaurants. Event Planning was not significantly different from Restaurants, and Hotels & Travel was not statistically different from Shopping. All other combinations were statistically significantly different.
# Kruskal-Wallis Test for polarity vs. business category
restaurants = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Restaurants"
]["Polarity"]
events = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Event Planning & Services"
]["Polarity"]
shopping = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Shopping"
]["Polarity"]
beauty = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Beauty & Spas"
]["Polarity"]
art = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Arts & Entertainment"
]["Polarity"]
travel = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Hotels & Travel"
]["Polarity"]
result = stats.kruskal(restaurants, events, shopping, beauty, art, travel)
dunn = sp.posthoc_dunn(
combined_filtered_exploded,
val_col="Polarity",
group_col="categories",
p_adjust="bonferroni",
)
print(
"Kruskal Wallis & Dunn's Test Results for Polarity vs. Business Category"
)
print(f"Kruskal-Wallis Statistic: {result.statistic:.2f}")
print(f"P-Value: {result.pvalue:.2e}")
display(dunn.round(4))
Kruskal Wallis & Dunn's Test Results for Polarity vs. Business Category
Kruskal-Wallis Statistic: 159.01
P-Value: 1.61e-32
| Arts & Entertainment | Beauty & Spas | Event Planning & Services | Hotels & Travel | Restaurants | Shopping | |
|---|---|---|---|---|---|---|
| Arts & Entertainment | 1.0000 | 0.0249 | 0.0001 | 1.0000 | 0.0 | 0.6696 |
| Beauty & Spas | 0.0249 | 1.0000 | 1.0000 | 0.0027 | 1.0 | 0.0000 |
| Event Planning & Services | 0.0001 | 1.0000 | 1.0000 | 0.0000 | 1.0 | 0.0000 |
| Hotels & Travel | 1.0000 | 0.0027 | 0.0000 | 1.0000 | 0.0 | 1.0000 |
| Restaurants | 0.0000 | 1.0000 | 1.0000 | 0.0000 | 1.0 | 0.0000 |
| Shopping | 0.6696 | 0.0000 | 0.0000 | 1.0000 | 0.0 | 1.0000 |
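Reading the significant pairs off the Dunn matrix by eye is error-prone; a small helper can list them instead. This is a sketch, assuming the input is the symmetric DataFrame of adjusted p-values that `posthoc_dunn` returns (the 3×3 matrix here uses toy values, not the results above):

```python
import pandas as pd
from itertools import combinations

def significant_pairs(pmat: pd.DataFrame, alpha: float = 0.05):
    # Walk the upper triangle and keep pairs below the significance level
    return [
        (a, b, pmat.loc[a, b])
        for a, b in combinations(pmat.index, 2)
        if pmat.loc[a, b] < alpha
    ]

# Toy 3x3 matrix in the shape posthoc_dunn returns (diagonal = 1)
toy = pd.DataFrame(
    [[1.00, 0.01, 0.60],
     [0.01, 1.00, 0.02],
     [0.60, 0.02, 1.00]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)
print(significant_pairs(toy))  # [('A', 'B', 0.01), ('B', 'C', 0.02)]
```

Applied to the `dunn` table above, the same helper would enumerate exactly the pairwise differences described in the text.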
I ran the same tests for Subjectivity and obtained very similar results. The Kruskal-Wallis statistic was 362.05, corresponding to a p-value of approximately 0. The same pairwise combinations that were (or were not) significantly different for polarity were (or were not) significantly different for subjectivity.
# Kruskal-Wallis Test for subjectivity vs. business category
restaurants = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Restaurants"
]["Subjectivity"]
events = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Event Planning & Services"
]["Subjectivity"]
shopping = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Shopping"
]["Subjectivity"]
beauty = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Beauty & Spas"
]["Subjectivity"]
art = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Arts & Entertainment"
]["Subjectivity"]
travel = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Hotels & Travel"
]["Subjectivity"]
result = stats.kruskal(restaurants, events, shopping, beauty, art, travel)
dunn = sp.posthoc_dunn(
combined_filtered_exploded,
val_col="Subjectivity",
group_col="categories",
p_adjust="bonferroni",
)
print(
"Kruskal Wallis & Dunn's Test Results for Subjectivity vs. Business Category"
)
print(f"Kruskal-Wallis Statistic: {result.statistic:.2f}")
print(f"P-Value: {result.pvalue:.2e}")
display(dunn.round(4))
Kruskal Wallis & Dunn's Test Results for Subjectivity vs. Business Category
Kruskal-Wallis Statistic: 362.05
P-Value: 4.46e-76
| Arts & Entertainment | Beauty & Spas | Event Planning & Services | Hotels & Travel | Restaurants | Shopping | |
|---|---|---|---|---|---|---|
| Arts & Entertainment | 1.0 | 1.0000 | 0.0 | 1.0 | 0.0 | 1.0000 |
| Beauty & Spas | 1.0 | 1.0000 | 0.0 | 1.0 | 0.0 | 0.0239 |
| Event Planning & Services | 0.0 | 0.0000 | 1.0 | 0.0 | 1.0 | 0.0000 |
| Hotels & Travel | 1.0 | 1.0000 | 0.0 | 1.0 | 0.0 | 1.0000 |
| Restaurants | 0.0 | 0.0000 | 1.0 | 0.0 | 1.0 | 0.0000 |
| Shopping | 1.0 | 0.0239 | 0.0 | 1.0 | 0.0 | 1.0000 |
Finally, I obtained similar results for polarity strength, which is somewhat expected given that polarity strength is just the absolute value of the polarity. The Kruskal-Wallis statistic was 170.14, corresponding to a p-value of approximately 0. The same pairwise combinations that were (or were not) significantly different for polarity were (or were not) significantly different for polarity strength.
# Kruskal-Wallis Test for absolute polarity vs. business category
restaurants = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Restaurants"
]["Abs_polarity"]
events = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Event Planning & Services"
]["Abs_polarity"]
shopping = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Shopping"
]["Abs_polarity"]
beauty = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Beauty & Spas"
]["Abs_polarity"]
art = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Arts & Entertainment"
]["Abs_polarity"]
travel = combined_filtered_exploded[
combined_filtered_exploded["categories"] == "Hotels & Travel"
]["Abs_polarity"]
result = stats.kruskal(restaurants, events, shopping, beauty, art, travel)
dunn = sp.posthoc_dunn(
combined_filtered_exploded,
val_col="Abs_polarity",
group_col="categories",
p_adjust="bonferroni",
)
print(
"Kruskal Wallis & Dunn's Test Results for Absolute Polarity vs. Business Category"
)
print(f"Kruskal-Wallis Statistic: {result.statistic:.2f}")
print(f"P-Value: {result.pvalue:.2e}")
display(dunn.round(4))
Kruskal Wallis & Dunn's Test Results for Absolute Polarity vs. Business Category
Kruskal-Wallis Statistic: 170.14
P-Value: 6.80e-35
| Arts & Entertainment | Beauty & Spas | Event Planning & Services | Hotels & Travel | Restaurants | Shopping | |
|---|---|---|---|---|---|---|
| Arts & Entertainment | 1.0 | 0.0000 | 0.000 | 1.0000 | 0.0 | 1.000 |
| Beauty & Spas | 0.0 | 1.0000 | 1.000 | 0.0043 | 1.0 | 0.000 |
| Event Planning & Services | 0.0 | 1.0000 | 1.000 | 0.0010 | 1.0 | 0.000 |
| Hotels & Travel | 1.0 | 0.0043 | 0.001 | 1.0000 | 0.0 | 0.554 |
| Restaurants | 0.0 | 1.0000 | 1.000 | 0.0000 | 1.0 | 0.000 |
| Shopping | 1.0 | 0.0000 | 0.000 | 0.5540 | 0.0 | 1.000 |
In addition to seeing whether the sentiment metrics differed across business categories, I also wanted to see whether star rating differed by business category. I made two stacked bar charts to get a sense of this. For simplicity, I rounded the business ratings down to the nearest whole number, so a rating of 3.5 becomes 3.
The left chart shows that Event Planning & Services had the highest proportion of 5-star rated businesses and Restaurants had the lowest. Restaurants and Arts & Entertainment had the highest proportions of 3- and 4-star rated businesses. The distributions of star ratings appear to differ by category.
The chart on the right shows Beauty & Spas businesses getting the highest proportion of 5-star reviews and Arts & Entertainment getting the lowest. However, Beauty & Spas businesses also appear to get a higher proportion of 1-star reviews, along with Hotels & Travel and Shopping businesses. Event Planning & Services, Restaurants, and Arts & Entertainment businesses had lower proportions of 1-star reviews. From this graph, Hotels & Travel and Shopping appear to have similar distributions, as do Restaurants and Arts & Entertainment.
combined_filtered_exploded["stars_review"] = combined_filtered_exploded[
"stars_review"
].astype(int)
combined_filtered_exploded["stars_business_round"] = np.floor(
combined_filtered_exploded["stars_business"]
).astype(int)
# Order categories by their share of 5-star ratings (descending)
def order_by_five_star_share(df, col):
    shares = (
        df.groupby("categories")[col]
        .value_counts(normalize=True)
        .rename("proportion")
        .reset_index(level=col)
    )
    return (
        shares[shares[col] == 5]
        .sort_values(by="proportion", ascending=False)
        .index.tolist()
    )

sorted_categories_business = order_by_five_star_share(
    combined_filtered_exploded, "stars_business_round"
)
sorted_categories_review = order_by_five_star_share(
    combined_filtered_exploded, "stars_review"
)
combined_filtered_exploded["stars_review"] = pd.Categorical(
combined_filtered_exploded["stars_review"]
)
combined_filtered_exploded["stars_business_round"] = pd.Categorical(
combined_filtered_exploded["stars_business_round"]
)
# Create a figure and subplots
fig, axes = plt.subplots(1, 2, figsize=(15, 6))
# Stacked Bar Chart of Business Categories with Business Star Ratings
combined_filtered_exploded["categories"] = pd.Categorical(
combined_filtered_exploded["categories"],
categories=sorted_categories_business,
ordered=True,
)
sns.histplot(
data=combined_filtered_exploded,
y="categories",
hue="stars_business_round",
multiple="fill",
stat="proportion",
palette=sns.color_palette("hls", 5),
hue_order=combined_filtered_exploded[
"stars_business_round"
].cat.categories[::-1],
discrete=True,
shrink=0.8,
ax=axes[0],
)
axes[0].set_title("Proportion of Business Star Ratings by Business Category")
axes[0].set_ylabel("Business Category")
axes[0].set_xlabel("Proportion")
sns.move_legend(
axes[0],
"upper center",
bbox_to_anchor=(0.5, -0.15),
ncol=5,
title="Star Rating",
reverse=True,
)
# Stacked Bar Chart of Business Categories with Review Star Ratings
combined_filtered_exploded["categories"] = pd.Categorical(
combined_filtered_exploded["categories"],
sorted_categories_review,
ordered=True,
)
sns.histplot(
data=combined_filtered_exploded,
y="categories",
hue="stars_review",
multiple="fill",
stat="proportion",
palette=sns.color_palette("hls", 5),
hue_order=combined_filtered_exploded["stars_review"].cat.categories[::-1],
discrete=True,
shrink=0.8,
ax=axes[1],
)
axes[1].set_title("Proportion of Review Star Ratings by Business Category")
axes[1].set_ylabel("Business Category")
axes[1].set_xlabel("Proportion")
sns.move_legend(
axes[1],
"upper center",
bbox_to_anchor=(0.5, -0.15),
ncol=5,
title="Star Rating",
reverse=True,
)
# Adjust layout and show the combined plot
plt.tight_layout()
plt.show()
To assess whether the patterns seen in the graphs are statistically significant, I used an overall Chi-Square test of homogeneity as well as pairwise Chi-Square tests. I had to add a small number to each cell of the contingency table in order to run these tests, since Arts & Entertainment did not contain any 1-star rated businesses. The overall Chi-Square test resulted in a test statistic of 5772.58, corresponding to a p-value of approximately 0. This indicates sufficient evidence to reject the null hypothesis that the distributions of business star ratings are the same across all categories. To understand how many differences there were and where, I ran pairwise Chi-Square tests (results shown below), using the Bonferroni correction to adjust for multiple testing. This returned significant p-values for all combinations, indicating sufficient evidence to reject the null and conclude that each category's distribution of business star ratings differed from the others.
combined_filtered_exploded["categories"] = pd.Categorical(
combined_filtered_exploded["categories"],
category_order,
ordered=True,
)
# Chi-square
contingency_table = pd.crosstab(
combined_filtered_exploded["stars_business_round"],
combined_filtered_exploded["categories"],
)
# Add 0.5 to each cell, since Arts & Entertainment has no 1-star businesses
contingency_table += 0.5
# Overall Chi-square
chi2, p, dof, expected = stats.chi2_contingency(contingency_table)
print("Chi-Square Tests for Business Star Ratings by Category")
print("Chi-Square Statistic:", chi2.round(2))
print("P-Value:", p)
# Perform Chi-Square test for each pair of categories
categories = contingency_table.columns
p_values = []
comparisons = []
for cat1, cat2 in combinations(categories, 2):
    sub_table = contingency_table[[cat1, cat2]]
    chi2, p, _, _ = stats.chi2_contingency(sub_table)
    p_values.append(p)
    comparisons.append(f"{cat1} vs. {cat2}")
# Adjust p-values using Bonferroni correction
_, p_adjusted, _, _ = smm.multipletests(p_values, method="bonferroni")
# Create a DataFrame for the results
post_hoc_results = pd.DataFrame(
{
"Comparison": comparisons,
"Adjusted P-Value": p_adjusted,
}
)
# Add significance column
post_hoc_results["Significant"] = post_hoc_results["Adjusted P-Value"] < 0.05
post_hoc_results = post_hoc_results.set_index("Comparison")
post_hoc_results.index.name = None
display(post_hoc_results)
Chi-Square Tests for Business Star Ratings by Category
Chi-Square Statistic: 5772.58
P-Value: 0.0
| | Adjusted P-Value | Significant |
|---|---|---|
| Restaurants vs. Event Planning & Services | 0.000000e+00 | True |
| Restaurants vs. Shopping | 0.000000e+00 | True |
| Restaurants vs. Beauty & Spas | 0.000000e+00 | True |
| Restaurants vs. Arts & Entertainment | 4.753070e-59 | True |
| Restaurants vs. Hotels & Travel | 0.000000e+00 | True |
| Event Planning & Services vs. Shopping | 1.839501e-93 | True |
| Event Planning & Services vs. Beauty & Spas | 6.533687e-41 | True |
| Event Planning & Services vs. Arts & Entertainment | 1.876549e-87 | True |
| Event Planning & Services vs. Hotels & Travel | 8.423155e-54 | True |
| Shopping vs. Beauty & Spas | 1.556177e-28 | True |
| Shopping vs. Arts & Entertainment | 4.248370e-57 | True |
| Shopping vs. Hotels & Travel | 2.592184e-31 | True |
| Beauty & Spas vs. Arts & Entertainment | 3.539321e-76 | True |
| Beauty & Spas vs. Hotels & Travel | 1.637500e-24 | True |
| Arts & Entertainment vs. Hotels & Travel | 5.797439e-99 | True |
I ran the same tests for review star ratings. I got an overall Chi-Square test statistic of 1322.49, corresponding to a p-value of approximately 0. This is sufficient evidence to reject the null hypothesis that the distributions of review star ratings are the same across all business categories. The pairwise tests (results shown below) returned mostly statistically significant results, providing sufficient evidence to reject the null that their distributions are the same. There were two exceptions. One was Restaurants vs. Arts & Entertainment, which gave a p-value of around 0.8, much higher than the standard cutoff of 0.05. This does not provide enough evidence to reject the null that their distributions are the same. The other exception was somewhat less clear: Shopping vs. Hotels & Travel. This test returned a p-value of around 0.041. While this is below the standard cutoff of 0.05 and thus sufficient evidence to reject the null, it is still relatively high, so I wanted to draw attention to it. These exceptions make sense, given that these two pairs looked the closest in the stacked bar chart.
# Chi-square
contingency_table = pd.crosstab(
combined_filtered_exploded["stars_review"],
combined_filtered_exploded["categories"],
)
# Add 0.5 to each cell, matching the adjustment used for business ratings
contingency_table += 0.5
# Overall Chi-square
chi2, p, dof, expected = stats.chi2_contingency(contingency_table)
print("Chi-Square Tests for Review Star Ratings by Category")
print("Chi-Square Statistic:", chi2.round(2))
print("P-Value:", p)
# Perform Chi-Square test for each pair of categories
categories = contingency_table.columns
p_values = []
comparisons = []
for cat1, cat2 in combinations(categories, 2):
    sub_table = contingency_table[[cat1, cat2]]
    chi2, p, _, _ = stats.chi2_contingency(sub_table)
    p_values.append(p)
    comparisons.append(f"{cat1} vs. {cat2}")
# Adjust p-values using Bonferroni correction
_, p_adjusted, _, _ = smm.multipletests(p_values, method="bonferroni")
# Create a DataFrame for the results
post_hoc_results = pd.DataFrame(
{
"Comparison": comparisons,
"Adjusted P-Value": p_adjusted,
}
)
# Add significance column
post_hoc_results["Significant"] = post_hoc_results["Adjusted P-Value"] < 0.05
post_hoc_results = post_hoc_results.set_index("Comparison")
post_hoc_results.index.name = None
display(post_hoc_results)
Chi-Square Tests for Review Star Ratings by Category
Chi-Square Statistic: 1322.49
P-Value: 4.5204657190659196e-268
| | Adjusted P-Value | Significant |
|---|---|---|
| Restaurants vs. Event Planning & Services | 1.026970e-38 | True |
| Restaurants vs. Shopping | 1.665348e-88 | True |
| Restaurants vs. Beauty & Spas | 1.160864e-140 | True |
| Restaurants vs. Arts & Entertainment | 8.086487e-01 | False |
| Restaurants vs. Hotels & Travel | 3.864245e-45 | True |
| Event Planning & Services vs. Shopping | 1.246295e-23 | True |
| Event Planning & Services vs. Beauty & Spas | 1.014245e-40 | True |
| Event Planning & Services vs. Arts & Entertainment | 2.167515e-16 | True |
| Event Planning & Services vs. Hotels & Travel | 1.708936e-16 | True |
| Shopping vs. Beauty & Spas | 8.806005e-23 | True |
| Shopping vs. Arts & Entertainment | 3.255638e-32 | True |
| Shopping vs. Hotels & Travel | 4.090113e-02 | True |
| Beauty & Spas vs. Arts & Entertainment | 1.097079e-77 | True |
| Beauty & Spas vs. Hotels & Travel | 5.170178e-12 | True |
| Arts & Entertainment vs. Hotels & Travel | 4.089392e-29 | True |
I also wanted to get a sense of whether and how review star ratings changed over time for each business category. From the graph below, it appears that reviews for Event Planning & Services and Shopping both averaged slightly above 4 stars in 2006. However, as time went on, reviews for Event Planning & Services tended to be higher on average, with the average review rating for this category reaching around 4.7 in 2022. Shopping went the other way, with an average review rating of around 3.4 in 2022. Restaurants is the only category besides Event Planning & Services to maintain a steady increase in average review star rating: in 2005, its average was around 3.6, rising to around 4.1 in 2022. At first, Beauty & Spas showed a steady increase, from an average review rating of around 3.4 in 2007 to around 4.4 in 2016. However, it started dropping afterward, ending with an average of around 3.8 in 2022. Arts & Entertainment followed a similar trajectory, starting at around 3.4 in 2007 and rising to around 4.2 in 2017. It also started dropping after that, ending at around 3.9 in 2022. Finally, Hotels & Travel also had a rise and fall, albeit a less pronounced one, beginning with an average review rating of around 3.6 in 2009, rising to around 3.9 in 2015, maintaining that until 2017, then falling to around 3.5 in 2022.
combined["date"] = pd.to_datetime(combined["date"], errors="coerce")
combined_filtered_exploded["date"] = pd.to_datetime(
combined_filtered_exploded["date"], errors="coerce"
)
combined_filtered_exploded["stars_review"] = pd.to_numeric(
combined_filtered_exploded["stars_review"]
)
fig = px.scatter(
combined_filtered_exploded,
x="date",
y="stars_review",
trendline="lowess",
title="Review Star Ratings Over Time",
color="categories",
category_orders={"categories": category_order},
)
# Loop through fig.data and adjust the legend visibility
for trace in fig.data:
    if trace.mode == "markers":  # Scatter points
        trace.showlegend = False  # Do not show in the legend
    elif trace.mode == "lines":  # Trendline (LOWESS)
        trace.showlegend = True  # Show the trendline in the legend
# Keep only the LOWESS trendline traces (every second trace)
fig.data = fig.data[1::2]
fig.update_layout(
xaxis_title="Date",
yaxis_title="Star Ratings",
font_family="Times New Roman",
font_color="black",
legend_title="Category",
showlegend=True,
legend=dict(
x=1.35, # Horizontal position of the legend
y=0.5, # Vertical position of the legend
xanchor="right", # Anchor the legend horizontally to the right
yanchor="middle", # Anchor the legend vertically to the center
),
)
fig.update_yaxes(range=[3, 5], showline=True, linecolor="black")
fig.update_xaxes(
range=["2005-06-01", "2022-01-31"], showline=True, linecolor="black"
)
fig.show(renderer="notebook")
Finally, I wanted to get a sense of how the sentiment metrics differed over time for each business category. One thing to note is that absolute polarity over time mirrors polarity fairly closely, since polarity had far more positive values than negative ones.
All of the sentiment metrics showed a slow, steady rise for Restaurants. Reviews got more positive, more opinion-based, and more strongly expressed. The polarity and absolute polarity of Event Planning & Services reviews remained somewhat steady with a slight upward trend, indicating slightly more positive and strongly expressed reviews. The subjectivity followed a different trend, becoming more opinion-based, then starting to become more factual again. The sentiment metrics for Shopping showed a slight increase, then decrease, becoming more positive, opinion-based, and strongly expressed, then more negative, factual, and less intense.
The polarity and absolute polarity for Beauty & Spas went up and then down again, peaking drastically around 2016. The subjectivity followed the same trend, but in a much less pronounced way.
The polarity and absolute polarity for Arts & Entertainment followed the same trends as Beauty & Spas, but with a less pronounced peak. Subjectivity also followed the same trend as Beauty & Spas, but with a more pronounced peak at around 2017.
Finally, polarity and absolute polarity for Hotels & Travel showed a steady rise, becoming slightly more positive and strongly expressed. Subjectivity showed a more substantial rise, becoming more opinion-based.
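The mirroring of polarity and absolute polarity can be illustrated with a quick synthetic sketch (simulated scores, not the actual Yelp sample): when most scores are positive, taking the absolute value changes very few of them, so the two series track each other closely.

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated polarity scores skewed positive, loosely mimicking the Yelp sample
polarity = np.clip(rng.normal(loc=0.25, scale=0.2, size=10_000), -1, 1)
abs_polarity = np.abs(polarity)

# Few scores are negative, so |polarity| equals polarity most of the time
share_negative = (polarity < 0).mean()
corr = np.corrcoef(polarity, abs_polarity)[0, 1]
print(f"share negative: {share_negative:.2f}, correlation: {corr:.2f}")
```

With a mostly-positive distribution like this one, only about a tenth of the scores flip sign, and the correlation between the two series stays high.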
fig1 = px.scatter(
combined_filtered_exploded,
x="date",
y="Polarity",
trendline="lowess",
title="Review Polarity Over Time",
color="categories",
category_orders={"categories": category_order},
)
# Loop through fig1.data and adjust the legend visibility
for trace in fig1.data:
    if trace.mode == "markers":  # Scatter points
        trace.showlegend = False  # Do not show in the legend
    elif trace.mode == "lines":  # Trendline (LOWESS)
        trace.showlegend = True  # Show the trendline in the legend
# Keep only the LOWESS trendline traces (every second trace)
fig1.data = fig1.data[1::2]
fig1.update_layout(
xaxis_title="Date",
yaxis_title="Polarity",
font_family="Times New Roman",
font_color="black",
legend_title="Category",
showlegend=True,
legend=dict(
x=1.35, # Horizontal position of the legend
y=0, # Vertical position of the legend
xanchor="right", # Anchor the legend horizontally to the right
yanchor="bottom",  # Anchor the legend vertically to the bottom
),
)
fig2 = px.scatter(
combined_filtered_exploded,
x="date",
y="Subjectivity",
trendline="lowess",
title="Review Subjectivity Over Time",
color="categories",
category_orders={"categories": category_order},
)
fig2.data = fig2.data[1::2]
fig2.update_layout(
xaxis_title="Date",
yaxis_title="Subjectivity",
font_family="Times New Roman",
font_color="black",
legend_title="Category",
)
fig3 = px.scatter(
combined_filtered_exploded,
x="date",
y="Abs_polarity",
trendline="lowess",
title="Review Absolute Polarity Over Time",
color="categories",
category_orders={"categories": category_order},
)
fig3.data = fig3.data[1::2]
fig3.update_layout(
xaxis_title="Date",
yaxis_title="Absolute Polarity",
font_family="Times New Roman",
font_color="black",
)
# Create a 2x2 grid of subplots (three are used)
fig = make_subplots(
rows=2,
cols=2,
subplot_titles=(
"Review Polarity Over Time",
"Review Subjectivity Over Time",
"Review Absolute Polarity Over Time",
), # Titles for each subplot
)
# Add the traces from fig1 to the first subplot (row 1, column 1)
for trace in fig1.data:
    fig.add_trace(trace, row=1, col=1)
# Add the traces from fig2 to the second subplot (row 1, column 2)
for trace in fig2.data:
    fig.add_trace(trace, row=1, col=2)
# Add the traces from fig3 to the third subplot (row 2, column 1)
for trace in fig3.data:
    fig.add_trace(trace, row=2, col=1)
fig.update_yaxes(
range=[0.15, 0.3], showline=True, linecolor="black", row=1, col=1
)
fig.update_yaxes(
range=[0.52, 0.58], showline=True, linecolor="black", row=1, col=2
)
fig.update_yaxes(
range=[0.15, 0.3], showline=True, linecolor="black", row=2, col=1
)
fig.update_xaxes(
range=["2005-06-01", "2022-01-31"], showline=True, linecolor="black"
)
fig.show(renderer="notebook")
In this paper, I noticed an expected positive trend between polarity and review star rating. I also noticed that the median polarity for a one-star rating was around 0, indicating that most people tended to keep their reviews from sounding too negative. I also found a somewhat more surprising positive trend between subjectivity and review star rating.
I also explored these sentiment metrics with regard to business category. I found that for all three metrics, Arts & Entertainment was not statistically different from Hotels & Travel or Shopping. Beauty & Spas was not statistically different from Event Planning or Restaurants. Event Planning was not significantly different from Restaurants. Hotels & Travel was not statistically different from Shopping. All other combinations were statistically significantly different.
I explored how the distributions of business star ratings and review star ratings differed by business category. I found that the distributions of business star ratings were significantly different between each pair of categories. I also found that the distributions of review star ratings were significantly different between each pair of categories, except between Restaurants and Arts & Entertainment, with Shopping vs. Hotels & Travel as a borderline case.
Finally, I explored how review star ratings as well as the other sentiment metrics differed over time depending on business category.
It is important to remember that this analysis used TextBlob, a natural language processing tool that automatically calculated the polarity and subjectivity of each review. While this avoided potential bias compared to doing it by hand, it also means that some of those values may not have been accurate. This should be considered when interpreting the results.
This analysis also did not explore the reasons behind the findings. It is possible that the differences I found are due to demographic differences in who tends to frequent businesses in each category. It may be fruitful for future research to focus on uncovering potential reasons behind these findings.